136 research outputs found

    Multigranular scale speech recognition: technological and cognitive view

    We propose a Multigranular Automatic Speech Recognizer. The hypothesis is that the speech signal contains information distributed across several different time scales. Work from various scientific fields, ranging from neurobiology to speech technology, appears to agree on this assumption. In a broad sense, speech recognition in humans seems to be optimal because of a partially parallel process in which the left-to-right stream of speech is captured in a multilevel grid where several linguistic analyses take place simultaneously. In this view, our investigation aims to apply these ideas to the design of more robust and efficient recognizers.

    Syllable classification using static matrices and prosodic features

    In this paper we explore the usefulness of prosodic features for syllable classification. To do this, we represent the syllable as a static analysis unit, so that its acoustic-temporal dynamics are merged into a set of features that the SVM classifier considers as a whole. In the first part of our experiment we used MFCCs as features for classification, obtaining a maximum accuracy of 86.66%. The second part of our study tests whether prosodic information is complementary to cepstral information for syllable classification. The results show that combining the two types of information does improve classification, but further analysis is necessary for a more successful combination of the two types of features.
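One way to picture the "static analysis unit" idea is to resample each syllable's variable-length sequence of frame vectors to a fixed number of time slots and flatten the result, so that every syllable yields a vector of the same length. The sketch below is only an illustration under that assumption; the abstract does not say how the static matrix is built. The names `to_static_vector` and `syllable_vector`, the slot count, and the choice of prosodic values are hypothetical.

```python
from typing import List, Sequence


def to_static_vector(frames: Sequence[Sequence[float]], n_slots: int = 5) -> List[float]:
    """Resample per-frame feature vectors to n_slots frames and flatten.

    Assumption: linear interpolation along time; the paper may use a
    different scheme to obtain its static matrix.
    """
    if not frames:
        raise ValueError("syllable has no frames")
    if n_slots < 1:
        raise ValueError("n_slots must be at least 1")
    n_frames = len(frames)
    dim = len(frames[0])
    flat: List[float] = []
    for slot in range(n_slots):
        # Position of this slot on the original frame axis.
        pos = 0.0 if n_slots == 1 else slot * (n_frames - 1) / (n_slots - 1)
        lo = int(pos)
        hi = min(lo + 1, n_frames - 1)
        w = pos - lo
        for d in range(dim):
            flat.append((1.0 - w) * frames[lo][d] + w * frames[hi][d])
    return flat


def syllable_vector(frames: Sequence[Sequence[float]],
                    prosody: Sequence[float],
                    n_slots: int = 5) -> List[float]:
    """Append prosodic values (hypothetical: e.g. duration, mean F0) to the
    flattened cepstral vector, giving one fixed-length classifier input."""
    return to_static_vector(frames, n_slots) + list(prosody)


# Toy 3-frame syllable with 2 coefficients per frame (invented numbers).
toy_frames = [[0.0, 10.0], [2.0, 20.0], [4.0, 30.0]]
static = to_static_vector(toy_frames, n_slots=5)
combined = syllable_vector(toy_frames, prosody=[0.18, 210.0], n_slots=5)
```

Because every syllable maps to a vector of length `n_slots * dim` (plus the prosodic values), such vectors could be passed to any fixed-input classifier, an SVM included.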

    Silent pauses as clarification trigger

    Clarification requests (CRs) are among the forms of pragmatic feedback an interlocutor can use to signal how well an utterance has been understood. The functional role of CRs can also be expressed through silent pauses, or failed turn-giving moves, which signal an understanding problem and are resolved through a clarifying speech act. In this work, we therefore hypothesise that some silent pauses, under specific conditions, also have an interactional role that the speaker interprets as a need for clarification.

    Machine Learning of Probabilistic Phonological Pronunciation Rules from the Italian CLIPS Corpus

    A blending of phonological concepts and technical analysis is proposed to yield better modeling and understanding of phonological processes. Based on the manual segmentation and labeling of the Italian CLIPS corpus, we automatically derive a probabilistic set of phonological pronunciation rules: a new alignment technique maps the phonological form of spontaneous sentences onto the phonetic surface form. A machine-learning algorithm then calculates a set of phonological replacement rules together with their conditional probabilities. A critical analysis of the resulting probabilistic rule set is presented and discussed with regard to regional Italian accents. The rule set presented here is also applied in the newly published web service WebMAUS, which allows a user to segment and phonetically label Italian speech via a simple web interface.
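The notion of replacement rules with conditional probabilities can be illustrated by counting, for each canonical phone in its left and right context, how often each surface realisation occurs. This is a minimal sketch, not the authors' algorithm: it assumes the alignment has already been done (equal-length symbol lists, with `-` marking a deleted phone), uses a one-phone context window, and runs on invented toy data. The function name `estimate_rules` is hypothetical.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple

Context = Tuple[str, str, str]      # (left neighbour, canonical phone, right neighbour)
Rule = Tuple[Context, str]          # (context, surface realisation)


def estimate_rules(aligned: Sequence[Tuple[List[str], List[str]]]) -> Dict[Rule, float]:
    """Estimate P(surface | left, canonical, right) by relative frequency.

    Assumption: each pair holds a canonical and a surface sequence of equal
    length, already aligned symbol by symbol; '#' marks word boundaries.
    """
    context_counts: Counter = Counter()
    rule_counts: Counter = Counter()
    for canonical, surface in aligned:
        if len(canonical) != len(surface):
            raise ValueError("sequences must be aligned to equal length")
        padded = ["#"] + list(canonical) + ["#"]
        for i, realised in enumerate(surface):
            context = (padded[i], padded[i + 1], padded[i + 2])
            context_counts[context] += 1
            rule_counts[(context, realised)] += 1
    return {rule: n / context_counts[rule[0]] for rule, n in rule_counts.items()}


# Invented toy alignments: the final vowel is deleted in one of two tokens.
toy_alignments = [
    (["k", "a", "n", "e"], ["k", "a", "n", "e"]),
    (["k", "a", "n", "e"], ["k", "a", "n", "-"]),
]
rules = estimate_rules(toy_alignments)
```

In this toy case the word-final vowel after `n` is kept or deleted with equal estimated probability, while every other phone is realised canonically with probability 1.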